Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening
Abstract
This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with the Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensembles of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power in protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test, respectively, the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the proposed topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
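To make the role of the Wasserstein distance more concrete, the following is a minimal sketch of how two persistence diagrams might be compared. It is not the authors' implementation: the function name `wasserstein_distance`, the choice of the L-infinity ground metric, and the default `p=2` are assumptions made for illustration. The matching is found by brute force over permutations, which is exact but only practical for very small diagrams.

```python
from itertools import permutations


def wasserstein_distance(diagram_a, diagram_b, p=2):
    """Exact p-Wasserstein distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other or to the diagonal (birth == death). Hypothetical helper
    for illustration; brute force, so keep diagrams tiny.
    """
    n, m = len(diagram_a), len(diagram_b)
    size = n + m  # each side is padded with diagonal slots for the other side
    if size == 0:
        return 0.0

    def cost(i, j):
        a_is_point = i < n  # indices >= n on side A are diagonal slots
        b_is_point = j < m  # indices >= m on side B are diagonal slots
        if a_is_point and b_is_point:
            (b1, d1), (b2, d2) = diagram_a[i], diagram_b[j]
            # L-infinity ground distance between two off-diagonal points
            return max(abs(b1 - b2), abs(d1 - d2)) ** p
        if a_is_point:
            b1, d1 = diagram_a[i]
            # L-infinity distance from a point to the diagonal
            return ((d1 - b1) / 2) ** p
        if b_is_point:
            b2, d2 = diagram_b[j]
            return ((d2 - b2) / 2) ** p
        return 0.0  # diagonal matched to diagonal costs nothing

    # Minimize the total matching cost over all bijections.
    best = min(
        sum(cost(i, j) for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )
    return best ** (1 / p)
```

A distance of this kind can serve as the metric in a k-nearest-neighbors model, where each molecule is represented by its persistence diagram. For realistic diagram sizes, the brute-force search would need to be replaced by a polynomial-time assignment solver.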
Related resources
Communication-Aware Traffic Stream Optimization for Virtual Machine Placement in Cloud Datacenters with VL2 Topology
By pervasiveness of cloud computing, a colossal amount of applications from gigantic organizations increasingly tend to rely on cloud services. These demands caused a great number of applications in form of couple of virtual machines (VMs) requests to be executed on data centers’ servers. Some of applications are as big as not possible to be processed upon a single VM. Also, there exists severa...
A Machine Learning Approach to Enhance Scoring Performance in Docking-Based Virtual Screening Experiments: COX-1 as a Case Study
Molecular docking can be reasonably successful at reproducing X-ray poses of a ligand in the binding site of a protein. However, scoring functions are typically unsuccessful at correctly ranking ligands according to their binding affinity. Using cyclooxygenase-1 (COX-1), a particularly challenging workhorse in virtual screening (VS) we show how the use of support vector machines (SVMs), trained...
Beware of Machine Learning-Based Scoring Functions - On the Danger of Developing Black Boxes
Training machine learning algorithms with protein-ligand descriptors has recently gained considerable attention to predict binding constants from atomic coordinates. Starting from a series of recent reports stating the advantages of this approach over empirical scoring functions, we could indeed reproduce the claimed superiority of Random Forest and Support Vector Machine-based scoring function...
An Intelligent Machine Learning-Based Protection of AC Microgrids Using Dynamic Mode Decomposition
An intelligent strategy for the protection of AC microgrids is presented in this paper. This method was halving to an initial signal processing step and a machine learning-based forecasting step. The initial stage investigates currents and voltages with a window-based approach based on the dynamic decomposition method (DDM) and then involves the norms of the signals to the resultant DDM data. T...
Identification Psychological Disorders Based on Data in Virtual Environments Using Machine Learning
Introduction: Psychological disorders is one of the most problematic and important issue in today's society. Early prognosis of these disorders matters because receiving professional help at the appropriate time could improve the quality of life of these patients. Recently, researches use social media as a form of new tools in identifying psychological disorder. It seems that through the use of...